cyber defense
Daybreak is OpenAI's response to Anthropic's Claude Mythos
OpenAI has just launched Daybreak, a cybersecurity initiative that's clearly the company's competitor to Anthropic's Project Glasswing. If you'll recall, Glasswing uses Anthropic's unreleased AI model, Claude Mythos Preview, to serve its clients' cyber defense needs. It's been promising so far: Mozilla revealed in April that Mythos helped it find and patch 271 vulnerabilities in the latest release of the Firefox browser. OpenAI says Daybreak uses its various AI models, including its specialized security agent Codex. In its announcement, the company explained that Daybreak rests on the premise that cyber defense should be built into software from the start and not just revolve around finding and fixing vulnerabilities.
Tapas Are Free! Training-Free Adaptation of Programmatic Agents via LLM-Guided Program Synthesis in Dynamic Environments
Hu, Jinwei, Dong, Yi, Sun, Youcheng, Huang, Xiaowei
Autonomous agents in safety-critical applications must continuously adapt to dynamic conditions without compromising performance and reliability. This work introduces TAPA (Training-free Adaptation of Programmatic Agents), a novel framework that positions large language models (LLMs) as intelligent moderators of the symbolic action space. Unlike prior programmatic agents, which typically generate a monolithic policy program or rely on fixed symbolic action sets, TAPA synthesizes and adapts modular programs for individual high-level actions, referred to as logical primitives. By decoupling strategic intent from execution, TAPA enables meta-agents to operate over an abstract, interpretable action space while the LLM dynamically generates, composes, and refines symbolic programs tailored to each primitive. Extensive experiments across cybersecurity and swarm intelligence domains validate TAPA's effectiveness. In autonomous DDoS defense scenarios, TAPA achieves 77.7% network uptime while maintaining near-perfect detection accuracy in unknown dynamic environments. In swarm intelligence formation control under environmental and adversarial disturbances, TAPA consistently preserves consensus at runtime where baseline methods fail. This work promotes a paradigm shift for autonomous system design in evolving environments, from policy adaptation to dynamic action adaptation.
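The split between strategic intent and execution described in this abstract can be illustrated with a small sketch. It is not TAPA's implementation: the `PrimitiveRegistry` class, the `rate_limit` primitive, the thresholds, and `synthesize_program` (a hard-coded stand-in for an LLM call) are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict

State = Dict[str, float]
Program = Callable[[State], str]


class PrimitiveRegistry:
    """Maps each logical primitive name to its current executable program."""

    def __init__(self) -> None:
        self._programs: Dict[str, Program] = {}

    def bind(self, name: str, program: Program) -> None:
        # Rebinding a name swaps the program without touching the meta-agent.
        self._programs[name] = program

    def execute(self, name: str, state: State) -> str:
        return self._programs[name](state)


def meta_policy(state: State) -> str:
    """Strategic layer: chooses a primitive by name, knows nothing of its code."""
    return "rate_limit"


def rate_limit_v1(state: State) -> str:
    # Initial program for the primitive (hypothetical threshold).
    return "throttle" if state["packets_per_second"] > 1000 else "allow"


def synthesize_program(primitive: str, feedback: str) -> Program:
    """Stand-in for an LLM call; here it just returns a fixed, stricter program."""
    def rate_limit_v2(state: State) -> str:
        return "throttle" if state["packets_per_second"] > 500 else "allow"
    return rate_limit_v2


registry = PrimitiveRegistry()
registry.bind("rate_limit", rate_limit_v1)
state = {"packets_per_second": 800.0}

before = registry.execute(meta_policy(state), state)
# The environment changed: adapt the primitive's program, not the policy.
registry.bind("rate_limit", synthesize_program("rate_limit", "attack stays under old threshold"))
after = registry.execute(meta_policy(state), state)
```

In this toy run the meta-policy stays fixed while the program behind the `rate_limit` primitive is replaced, which is the "dynamic action adaptation" idea in miniature.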
Security Logs to ATT&CK Insights: Leveraging LLMs for High-Level Threat Understanding and Cognitive Trait Inference
Hans, Soham, Marsella, Stacy, Hirschmann, Sophia, Gurney, Nikolos
Understanding adversarial behavior in cybersecurity has traditionally relied on high-level intelligence reports and manual interpretation of attack chains. However, real-time defense requires the ability to infer attacker intent and cognitive strategy directly from low-level system telemetry such as intrusion detection system (IDS) logs. In this paper, we propose a novel framework that leverages large language models (LLMs) to analyze Suricata IDS logs and infer attacker actions in terms of MITRE ATT&CK techniques. Our approach is grounded in the hypothesis that attacker behavior reflects underlying cognitive biases such as loss aversion, risk tolerance, or goal persistence that can be extracted and modeled through careful observation of log sequences. This lays the groundwork for future work on behaviorally adaptive cyber defense and cognitive trait inference. We develop a strategy-driven prompt system to efficiently segment large volumes of network log data into distinct behavioral phases, enabling the LLM to associate each phase with likely techniques and underlying cognitive motives. By mapping network-layer events to high-level attacker strategies, our method reveals how behavioral signals such as tool switching, protocol transitions, or pivot patterns correspond to psychologically meaningful decision points. The results demonstrate that LLMs can bridge the semantic gap between packet-level logs and strategic intent, offering a pathway toward cognitive-adaptive cyber defense. Keywords: Cognitive Cybersecurity, Large Language Models (LLMs), Cyberpsychology, Intrusion Detection Systems (IDS), MITRE ATT&CK, Cognitive Biases
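The abstract does not say how logs are segmented, so the following is only a rough sketch of what phase segmentation over Suricata EVE JSON alerts could look like. The rule used here (start a new phase on a time gap over five minutes or a protocol change) is an assumption, not the paper's method, and the sample records and signature names are made up. The `timestamp`, `event_type`, `proto`, and `alert.signature` fields follow the EVE JSON layout.

```python
import json
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"  # timestamp layout used in EVE JSON records


def parse_alerts(lines):
    """Keep only alert records and return them sorted by time."""
    alerts = []
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("event_type") != "alert":
            continue
        alerts.append({
            "time": datetime.strptime(record["timestamp"], TS_FORMAT),
            "proto": record.get("proto", "unknown"),
            "signature": record.get("alert", {}).get("signature", ""),
        })
    return sorted(alerts, key=lambda a: a["time"])


def segment_phases(alerts, max_gap=timedelta(minutes=5)):
    """Assumed heuristic: a long pause or a protocol switch opens a new phase."""
    phases = []
    for alert in alerts:
        if phases:
            previous = phases[-1][-1]
            if (alert["time"] - previous["time"] <= max_gap
                    and alert["proto"] == previous["proto"]):
                phases[-1].append(alert)
                continue
        phases.append([alert])
    return phases


def summarize_phase(index, phase):
    """Compact text for one phase, e.g. as input to an LLM prompt."""
    signatures = ", ".join(sorted({a["signature"] for a in phase}))
    return f"Phase {index}: {len(phase)} alert(s) over {phase[0]['proto']}; signatures: {signatures}"


# Made-up sample records for illustration only.
SAMPLE_LOG = [
    '{"timestamp": "2024-05-01T12:00:00.000000+0000", "event_type": "alert", "proto": "TCP", "alert": {"signature": "example scan"}}',
    '{"timestamp": "2024-05-01T12:01:00.000000+0000", "event_type": "alert", "proto": "TCP", "alert": {"signature": "example scan"}}',
    '{"timestamp": "2024-05-01T12:01:30.000000+0000", "event_type": "flow", "proto": "TCP"}',
    '{"timestamp": "2024-05-01T12:02:00.000000+0000", "event_type": "alert", "proto": "UDP", "alert": {"signature": "example dns query"}}',
    '{"timestamp": "2024-05-01T12:30:00.000000+0000", "event_type": "alert", "proto": "UDP", "alert": {"signature": "example dns query"}}',
]

alerts = parse_alerts(SAMPLE_LOG)
phases = segment_phases(alerts)
summaries = [summarize_phase(i, p) for i, p in enumerate(phases, start=1)]
```

Each summary string could then be placed into a prompt that asks the model for likely ATT&CK techniques; that prompting step is left out because the source does not describe it.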
PACEbench: A Framework for Evaluating Practical AI Cyber-Exploitation Capabilities
Liu, Zicheng, Huang, Lige, Zhang, Jie, Liu, Dongrui, Tian, Yuan, Shao, Jing
For instance, while several models can exploit CVE-2023-50564 in the isolated A-CVE setting, none succeed in the corresponding B-CVE environment where the vulnerable target is blended with benign hosts (BN 4 challenge). The C-CVE scenarios, which simulate more realistic penetration tests with multi-host dependencies, present an even greater challenge. As shown in Table 1, model performance drops further in these scenarios, with agents often completing only intermediate steps rather than the full end-to-end attack. For example, in the Chain 1 challenge, agents manage to compromise the initial perimeter server but fail in the subsequent phases of lateral movement, privilege escalation, or internal target discovery, thus failing to complete the full attack chain. Current models could not bypass the deployed cyber defenses. As shown in Table 1, every model scores zero in the D-CVE scenarios, suggesting that no agent could autonomously discover a bypass for any of the up-to-date WAFs. This finding is particularly significant, as it indicates that current model capabilities have not yet crossed a key "safety red line" (red-lines.ai,
Agentic AI and the Cyber Arms Race
Oesch, Sean, Hutchins, Jack, Austria, Phillipe, Chaulagain, Amul
Abstract: Agentic AI is shifting the cybersecurity landscape as attackers and defenders leverage AI agents to augment humans and automate common tasks. In this article, we examine the implications for cyber warfare and global politics as Agentic AI becomes more powerful and enables the broad proliferation of capabilities only available to the most well-resourced actors today. As attacks increased in volume and attackers became more sophisticated, moving towards polymorphic malware, packers, and novel evasion techniques, defenders looked to machine learning to provide scalability (quickly analyze large volumes of data and automate repetitive tasks), pattern recognition (detect common attack patterns), and novelty detection (recognize abnormal behaviors that may indicate malicious actors or insider threats). Companies now use Large Language Models (LLMs) to provide analysts and reverse engineers with a rapid analysis of malicious code and best next steps when triaging alerts. But the real paradigm shift in cybersecurity for both attackers and defenders is still on the horizon: agentic artificial intelligence (agentic AI).
IRSKG: Unified Intrusion Response System Knowledge Graph Ontology for Cyber Defense
Panigrahi, Damodar, Mitra, Shaswata, Neupane, Subash, Mittal, Sudip, Blakely, Benjamin A.
Cyberattacks are becoming increasingly difficult to detect and prevent due to their sophistication. In response, Autonomous Intelligent Cyber-defense Agents (AICAs) are emerging as crucial solutions. One prominent AICA is the Intrusion Response System (IRS), which is critical for mitigating threats after detection. IRS uses several Tactics, Techniques, and Procedures (TTPs) to mitigate attacks and restore the infrastructure to normal operations. Continuous monitoring of the enterprise infrastructure is an essential TTP the IRS uses. However, each system serves different purposes to meet operational needs. Integrating these disparate sources for continuous monitoring increases pre-processing complexity and limits automation, eventually prolonging the critical response time and giving attackers a longer window to exploit. We propose a unified IRS Knowledge Graph ontology (IRSKG) that streamlines the onboarding of new enterprise systems as a source for the AICAs. Our ontology can capture system monitoring logs and supplemental data, such as a rules repository containing the administrator-defined policies to dictate the IRS responses. In addition, our ontology permits us to incorporate dynamic changes to adapt to the evolving cyber-threat landscape. This robust yet concise design allows machine learning models to train effectively and recover a compromised system to its desired state autonomously with explainability.
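To make the idea of one graph fed by disparate sources more concrete, here is a minimal sketch in which two differently shaped log records are normalized into subject-predicate-object triples and matched against an administrator rule. The predicates (`hasType`, `hasSource`, `triggersResponse`), the record fields, and the event types are hypothetical and are not taken from the IRSKG ontology.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = defaultdict(list)

    def add(self, subject, predicate, obj):
        self._triples[subject].append((predicate, obj))

    def objects(self, subject, predicate):
        return [o for p, o in self._triples.get(subject, []) if p == predicate]


def ingest_firewall(store, record):
    """Map a (hypothetical) firewall log record onto the shared vocabulary."""
    event = f"event:fw-{record['id']}"
    event_type = "blocked_connection" if record["action"] == "deny" else "traffic"
    store.add(event, "hasType", event_type)
    store.add(event, "hasSource", record["src"])
    return event


def ingest_auth(store, record):
    """Map a (hypothetical) authentication log record onto the same vocabulary."""
    event = f"event:auth-{record['seq']}"
    store.add(event, "hasType", "login" if record["ok"] else "failed_login")
    store.add(event, "hasSource", record["client_ip"])
    return event


def responses_for(store, event):
    """Look up administrator-defined responses for the event's type(s)."""
    responses = []
    for event_type in store.objects(event, "hasType"):
        responses.extend(store.objects(f"rule:{event_type}", "triggersResponse"))
    return responses


store = TripleStore()
# Rules repository entry: administrator policy stored in the same graph.
store.add("rule:failed_login", "triggersResponse", "lock_account")

fw_event = ingest_firewall(store, {"id": 7, "action": "deny", "src": "10.0.0.5"})
auth_event = ingest_auth(store, {"seq": 42, "ok": False, "client_ip": "10.0.0.5"})

auth_responses = responses_for(store, auth_event)
fw_responses = responses_for(store, fw_event)
```

Onboarding a new system in this sketch means writing one more `ingest_*` function; the rule lookup is unchanged because every source ends up in the same vocabulary.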
The Path To Autonomous Cyber Defense
Oesch, Sean, Austria, Phillipe, Chaulagain, Amul, Weber, Brian, Watson, Cory, Dixson, Matthew, Sadovnik, Amir
Abstract: Defenders are overwhelmed by the number and scale of attacks against their networks. This problem will only be exacerbated as attackers leverage artificial intelligence to automate their workflows. We propose a path to autonomous cyber agents able to augment defenders by automating critical steps in the cyber defense life cycle. The creation of autonomous cyber defense agents is one promising approach to automate operations and prevent cyber defenders from being overwhelmed. [...] deep neural nets in order to generalize well across states. By leveraging deep RL, DeepMind has trained reinforcement learning algorithms to defeat expert human [...]
Japan embraces AI to boost cyberdefense, fight disinformation
TOKYO -- The Japanese government announced last week that it will be adding 23 new technologies, including specific AI technologies, to its list of "Specified Key Technologies," according to the Cabinet Office's website. This designation means that the government will fund public and private research institutions to develop AI technologies for "active cyber defense" to prevent cyberattacks and technologies that can be used for detecting disinformation. The newly added technologies cover four areas, including cyberspace, maritime, aerospace and biotechnology. This is part of the government's Fostering Key Technologies for Economic Security strategy under the Economic Security Promotion Act.
AI Will Be a Double-Edged Sword in Future Cyber Conflicts
"Artificial Intelligence and machine learning โฆ [are] foundational to the future of cybersecurity. We have got to work our way through how we're going to deal with this. It is not the if, it's only the when to me," Adm. Mike Rogers, former chief of the National Security Agency and U.S. Cyber Command, remarked in an interview. During his presidency, Barack Obama shared his concerns about an attacker using artificial intelligence (AI) to access launch codes for nuclear weapons. "If that's its only job, if it's self-teaching and it's just a really effective algorithm, then you've got problems," Obama said.
How AI is shaping the cybersecurity arms race
The average business receives 10,000 alerts every day from the various software tools it uses to monitor for intruders, malware and other threats. Cybersecurity staff often find themselves inundated with data they need to sort through to manage their cyber defenses. Cyberattacks are increasing and affect thousands of organizations and millions of people in the U.S. alone. These challenges underscore the need for better ways to stem the tide of cyber-breaches. Artificial intelligence is particularly well suited to finding patterns in huge amounts of data.